An Ensemble Approach to Corpus Based Word Sense Disambiguation

Author

  • Ted Pedersen
Abstract

This paper presents a corpus-based approach to word sense disambiguation that combines a number of Naive Bayesian classifiers into an ensemble that performs disambiguation via a majority vote. Each of the member classifiers is based on collocation and co-occurrence features found in varying sized windows of context. This approach is motivated by the observation that, in general, enhancing the feature set or learning algorithm used by a corpus-based approach does not improve disambiguation accuracy beyond what can be attained with shallow lexical features and the Naive Bayesian classifier. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves levels of accuracy comparable to the best previously published results.
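The idea in the abstract can be illustrated with a minimal sketch: several Naive Bayes classifiers, each seeing a different window of context around the ambiguous word, vote on the sense. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class name `WindowNaiveBayes`, the helper `ensemble_predict`, the window sizes, the add-one smoothing, the simplification of scoring only the words that are present, and the toy training sentences.

```python
import math
from collections import Counter, defaultdict


class WindowNaiveBayes:
    """Naive Bayes over words within `left`/`right` tokens of the target.

    Hypothetical helper for illustration; not the paper's implementation.
    """

    def __init__(self, left, right, alpha=1.0):
        self.left, self.right, self.alpha = left, right, alpha

    def _features(self, tokens, idx):
        # Co-occurring words inside the context window, target excluded.
        before = tokens[max(0, idx - self.left):idx]
        after = tokens[idx + 1:idx + 1 + self.right]
        return set(before) | set(after)

    def fit(self, examples):
        # examples: iterable of (tokens, target_index, sense)
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, idx, sense in examples:
            self.sense_counts[sense] += 1
            for word in self._features(tokens, idx):
                self.word_counts[sense][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, tokens, idx):
        # Simplification: only words present in the window (and seen in
        # training) contribute to the score; absent words are ignored.
        feats = self._features(tokens, idx) & self.vocab
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense in sorted(self.sense_counts):
            n = self.sense_counts[sense]
            score = math.log(n / total)  # log prior
            for word in feats:
                # Smoothed estimate of P(word in window | sense).
                p = (self.word_counts[sense][word] + self.alpha) / (n + 2 * self.alpha)
                score += math.log(p)
            if score > best_score:
                best, best_score = sense, score
        return best


def ensemble_predict(members, tokens, idx):
    """Majority vote over member classifiers (ties go to the first-seen vote)."""
    votes = Counter(m.predict(tokens, idx) for m in members)
    return votes.most_common(1)[0][0]


# Toy sense-tagged examples for the noun "line" (invented for this sketch).
train = [
    ("please hold the line while i call".split(), 3, "phone"),
    ("the phone line was busy all day".split(), 2, "phone"),
    ("a new product line of shoes".split(), 3, "product"),
    ("the company sells a line of cars".split(), 4, "product"),
]

# One member per (left, right) window size; the sizes are arbitrary choices.
members = [WindowNaiveBayes(l, r).fit(train) for l, r in [(1, 1), (2, 2), (5, 5)]]

print(ensemble_predict(members, "the phone line is dead".split(), 2))
print(ensemble_predict(members, "a line of boots".split(), 1))
```

Narrow windows mostly capture collocations next to the target word, while wide windows capture topical co-occurrences, so members built on different window sizes can make different errors. That diversity is what a majority vote relies on.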


Similar resources

A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation

This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves ac...


A Decision Tree of Bigrams is an Accurate Predictor of Word Sense

This paper presents a corpus-based approach to word sense disambiguation where a decision tree assigns a sense to an ambiguous word based on the bigrams that occur nearby. This approach is evaluated using the sense-tagged corpora from the 1998 SENSEVAL word sense disambiguation exercise. It is more accurate than the average results reported for 30 of 36 words, and is more accurate than the best...


Korean Word-Sense Disambiguation Using Parallel Corpus as Additional Resource

Most previous research on Korean Word-Sense Disambiguation (WSD) focused on unsupervised corpus-based or knowledge-based approaches because it suffered from a lack of sense-tagged Korean corpora. Recently, along with the great effort by government and researchers to construct sense-tagged Korean corpora, finding appropriate features for supervised learning approach and improving its prediction ...


Lexical Semantic Ambiguity Resolution with Bigram-Based Decision Trees

This paper presents a corpus-based approach to word sense disambiguation where a decision tree assigns a sense to an ambiguous word based on the bigrams that occur nearby. This approach is evaluated using the sense-tagged corpora from the 1998 SENSEVAL word sense disambiguation exercise. It is more accurate than the average results reported for 30 of 36 words, and is more accurate than the best...

Publication date: 2000